TGC-Tree: An Online Algorithm Tracing Closed Itemset and Transaction Set Simultaneously

Authors

  • Junbo Chen
  • Bo Zhou
Abstract

Finding association rules is a classical data mining task. The most critical part of association rule mining is finding the frequent itemsets in the database. Since the introduction of the famous Apriori algorithm [14], many other algorithms have been proposed to find frequent itemsets. Among them, the approach of mining closed itemsets has attracted considerable interest in the data mining community, because the closed itemsets are a condensed representation of all the frequent itemsets. Algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], and GC-Tree [15]. While these algorithms try to improve the performance of finding the Intents of Formal Concepts (in other words, the closed itemsets), they overlook another important piece of information: the Extents of Formal Concepts. In this paper, we propose an online algorithm, TGC-Tree, adapted from the GC-Tree algorithm [15], that can be used to trace the closed itemsets (Intents) and the corresponding transaction sets (Extents) simultaneously in an incremental way.
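
To make the Intent/Extent pairing concrete, the following is a minimal Python sketch of tracking closed itemsets together with their transaction sets as transactions arrive one at a time. It is not the TGC-Tree algorithm, whose tree structure is not described in this abstract; it is a naive intersection-based update, and the names `ConceptTracker`, `add`, and `concepts` are hypothetical. The sketch ignores minimum support and skips the empty itemset.

```python
from collections import defaultdict


class ConceptTracker:
    """Naive incremental tracker of (closed itemset, transaction set) pairs.

    Illustrative only: this is NOT TGC-Tree. It relies on the fact that,
    after adding a transaction t, the closed itemsets that change or appear
    are t itself and the intersections of t with existing closed itemsets.
    """

    def __init__(self):
        # Maps each closed itemset (Intent) to the ids of the
        # transactions that contain it (Extent).
        self.concepts = {}
        self.next_tid = 1

    def add(self, items):
        """Add one transaction and update all affected concepts."""
        t = frozenset(items)
        tid = self.next_tid
        self.next_tid += 1
        if not t:
            return tid  # an empty transaction affects no non-empty itemset

        updates = defaultdict(set)
        updates[t].add(tid)  # t itself is always closed once it is present
        for intent, extent in self.concepts.items():
            common = intent & t
            if common:
                # Every transaction containing `intent` also contains
                # `common`, and so does the new transaction.
                updates[common] |= extent
                updates[common].add(tid)

        # Closed itemsets that are not subsets of t keep their old extent.
        self.concepts.update(updates)
        return tid


tracker = ConceptTracker()
for transaction in (["a", "b", "c"], ["a", "b"], ["b", "c"]):
    tracker.add(transaction)

for intent, extent in sorted(tracker.concepts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(intent), sorted(extent))
```

Each update costs time proportional to the number of stored concepts, which is the kind of overhead a dedicated index structure would aim to reduce.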

Similar resources

An efficient algorithm for mining closed inter-transaction itemsets

In this paper, we propose an efficient algorithm, called ICMiner (Inter-transaction Closed patterns Miner), for mining closed inter-transaction itemsets. Our proposed algorithm consists of two phases. First, we scan the database once to find the frequent items. For each frequent item found, the ICMiner converts the original transaction database into a set of domain attributes, called a dataset....

Optimization Of Intersecting Algorithm For Transactions Of Closed Frequent Item Sets In Data Mining

Data mining is the computer-assisted process of information analysis. Mining frequent itemsets is a fundamental task in data mining. Unfortunately the number of frequent itemsets describing the data is often too large to comprehend. This problem has been attacked by condensed representations of frequent itemsets that are sub collections of frequent itemsets containing only the frequent itemsets...

Clohui: an Efficient Algorithm for Mining Closed

High-utility itemset mining (HUIM) is an important research topic in data mining field and extensive algorithms have been proposed. However, existing methods for HUIM present too many high-utility itemsets (HUIs), which reduces not only efficiency but also effectiveness of mining since users have to sift through a large number of HUIs to find useful ones. Recently a new representation, closed +...

DisClose : discovering colossal closed itemsets from high dimensional datasets via a compact row-tree

Data mining is an essential part of knowledge discovery, and performs the extraction of useful information from a collection of data, so as to assist human beings in making necessary decisions. This thesis describes research in the field of itemset mining, which performs the extraction of a set of items that occur together in a dataset, based on a user specified threshold. Recent focus of items...

An Accelerator for Frequent Itemset Mining from Data Streams with Parallel Item Tree

Frequent itemset mining attempts to find frequent subsets in a transaction database. In this era of big data, demand for frequent itemset mining is increasing. Therefore, the combination of fast implementation and low memory consumption, especially for stream data, is needed. In response to this, we optimize an online algorithm, called Skip LC-SS algorithm [1], for hardware. In this paper, we p...

Publication date: 2008